SUMMARY

In this paper, a general class of stochastic estimation and control problems is formulated from the Bayesian decision-theoretic viewpoint. A discussion of how these problems can be solved step by step, in principle and in practice, from this approach is presented. As a specific example, the closed-form Wiener-Kalman solution for linear estimation in gaussian noise is derived. The purpose of the paper is to show that the Bayesian approach provides: (i) a general unifying framework within which to pursue further research in stochastic estimation and control problems, and (ii) the necessary computations and difficulties that must be overcome for these problems. An example of a nonlinear, non-gaussian estimation problem is also solved.

SINGLE  STAGE  ESTIMATION  PROBLEM 

For the purpose of illustrating the concepts involved, the single stage estimation problem will be discussed first. Once this is accomplished, the multistage problem can be treated straightforwardly.

Problem  Statement 

The following information is assumed given:

(i) A set of measurements $z_1, z_2, \ldots, z_k$, which are denoted by the vector $z$.

(ii) The physical relationship between the state of nature which is to be estimated and the measurements. This is given by

$$ z = g(x, v) \tag{1} $$

where $z$ is the measurement vector ($k \times 1$),
$x$ is the state (signal) vector ($n \times 1$),
$v$ is the noise vector ($q \times 1$).

(iii) The joint density function $p(x, v)$. From this one readily obtains the respective marginal density functions $p(x)$ and $p(v)$.

It is assumed that the information for (iii) is available in analytical form or can be approximated by analytical distributions. Item (ii) can be either in closed form or merely computable. The problem is to obtain an estimate $\hat{x}$ of $x$ which, based upon the measurements, is best in some sense to be defined later.


*The work reported in this paper is supported in part by Nonr Contract 1866(16) at Harvard University and by Minneapolis-Honeywell Regulator Company, Aeronautical Division, Boston, Massachusetts.

**Consultant at Minneapolis-Honeywell Regulator Company.


The Bayesian Solution

The Bayesian solution to the above problem now proceeds via the following steps:

(i) Evaluate $p(z)$. This can be done analytically, at least in principle, or experimentally by Monte Carlo methods, since $z = g(x, v)$ and $p(x, v)$ are given. In the latter case, we assume it is possible to fit the experimental distribution again by a member of a family of distributions.

(ii) At this point, two alternatives are possible; one may be superior to the other depending on the nature of the problem.

a) Evaluate $p(x, z)$. This is possible analytically if $v$ is of the same dimension as $z$ and one can obtain the functional relationship $v = g^*(x, z)$ from (1) above. Then using $p(x, v)$ and the theory of derived distributions, one obtains

$$ p(x, z) = p\big(x,\; v = g^*(x, z)\big)\,|J| \tag{2} $$

where

$$ J = \det\left[\frac{\partial g^*(x, z)}{\partial z}\right] $$

b) Evaluate $p(z/x)$. This conditional density function can always be obtained, either analytically whenever possible or experimentally, from $z = g(x, v)$ and $p(x, v)$.

Note that (iia) may be difficult to obtain in general, since $g^*$ may not exist either because of the nonlinear nature of $g$ or because $z$ and $v$ are of different dimensions. Nevertheless, (iib) can always be carried out. This fact will be demonstrated in the nonlinear example in the sequel.

(iii) Evaluate $p(x/z)$ using the following relationships.

a) Following (iia),

$$ p(x/z) = \frac{p(x, z)}{p(z)} \tag{3} $$

b) Following (iib), use Bayes' rule

$$ p(x/z) = \frac{p(z/x)\,p(x)}{p(z)} \tag{4} $$

Depending on the class of distributions one has assumed or obtained for $p(x, v)$, $p(z)$, $p(z/x)$, this key step may be easy or difficult to carry out. Several classes of distributions which have nice properties for this purpose can be found in [1]. The density function $p(x/z)$ is known as the a posteriori density function of $x$. It is the knowledge about the state of nature after the measurements $z$. By definition, it contains all the information necessary for estimation.

(iv) Depending on the criterion function for estimation, one can compute the estimate $\hat{x}$ from $p(x/z)$. Some typical examples are

a) Criterion: Maximize the probability that $\hat{x} = x$.

$$ \text{Solution: } \hat{x} = \text{Mode of } p(x/z) \tag{5} $$

This is defined as the most probable estimate.

SESSION XIV, PAPER 2
When the a priori density function $p(x)$ is uniform, this estimate is identical to the classical maximum likelihood estimate.

b) Criterion: Minimize $\int \|x - \hat{x}\|^2\, p(x/z)\, dx$.

$$ \text{Solution: } \hat{x} = E(x/z)^* \tag{6} $$

This is the conditional mean estimate.

c) Criterion: Minimize the maximum of $|x - \hat{x}|$.

$$ \text{Solution: } \hat{x} = \text{Medium of } p(x/z) \tag{7} $$

This can be defined as the minimax estimate.

Pictorially, the three estimates are shown in Fig. 1 for a general $p(x/z)$ in the scalar case.

Fig. 1. Estimates based on the a posteriori density ($x_c$: minimax estimate).

Clearly, other estimates, as well as confidence intervals, can be derived from $p(x/z)$ directly.
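As a rough numerical illustration of steps (i) through (iv), the sketch below evaluates $p(x/z)$ on a grid by Bayes' rule (4) and reads off the three estimates. Everything specific in it is an assumption made for illustration and does not come from the paper: the scalar model $z = x^2 + v$, the triangular prior on $[0, 2]$, the uniform noise on $[-0.5, 0.5]$, and the function names. The minimax estimate is taken here to be the midpoint of the support of $p(x/z)$, which is the point that minimizes the largest possible $|x - \hat{x}|$.

```python
def posterior_on_grid(z, grid, prior, likelihood):
    """Bayes' rule (4) on a uniform grid: p(x/z) = p(z/x) p(x) / p(z)."""
    dx = grid[1] - grid[0]
    unnorm = [likelihood(z, x) * prior(x) for x in grid]
    p_z = sum(unnorm) * dx              # step (i): numerical value of p(z)
    return [u / p_z for u in unnorm]    # step (iii)


def three_estimates(grid, post):
    """Step (iv): mode, conditional mean, and midpoint of the support."""
    dx = grid[1] - grid[0]
    mode = grid[max(range(len(grid)), key=lambda i: post[i])]
    mean = sum(x * p for x, p in zip(grid, post)) * dx
    support = [x for x, p in zip(grid, post) if p > 0.0]
    minimax = 0.5 * (support[0] + support[-1])
    return mode, mean, minimax


# Hypothetical problem (not from the paper): z = x**2 + v with x >= 0.
def prior(x):
    # Triangular density on [0, 2]; integrates to one.
    return 1.0 - 0.5 * x if 0.0 <= x <= 2.0 else 0.0


def likelihood(z, x):
    # p(z/x) = p_v(z - g(x)) for v uniform on [-0.5, 0.5].
    return 1.0 if abs(z - x * x) <= 0.5 else 0.0


grid = [i * 0.0005 for i in range(4001)]
post = posterior_on_grid(1.0, grid, prior, likelihood)
mode, mean, minimax = three_estimates(grid, post)
print(mode, mean, minimax)
```

For this skewed posterior the three estimates differ, with the mode at the lower edge of the support, the midpoint of the support highest, and the conditional mean in between. Note that only $p(z/x)$ was needed, which is the route of step (iib).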

Special Case of the Wiener-Kalman Filter (single stage)

Now a special case of the above estimation problem will be considered. Let there be given

(i) A set of measurements $z = (z_1, z_2, \ldots, z_k)$

(ii) The physical relationship

$$ z = Hx + v \tag{8} $$

(iii) The independent noise and state density functions

$$ p(x, v) = p(x)\,p(v) \tag{9} $$

with $p(x)$ gaussian,

$$ E(x) = \bar{x}, \qquad \operatorname{Cov}(x) = P_0 \tag{10} $$

and $p(v)$ gaussian,

$$ E(v) = 0, \qquad \operatorname{Cov}(v) = R \tag{11} $$

Now following the steps for the Bayesian solution one has

(i) Evaluate $p(z)$.

Since $z = Hx + v$ and $x$, $v$ are gaussian and independent, one immediately gets that $p(z)$ is gaussian with

$$ E(z) = H\bar{x}, \qquad \operatorname{Cov}(z) = H P_0 H^T + R \tag{12} $$

(iia) Evaluate $p(x, z)$.

Since $\partial g^* / \partial z$ is the identity matrix, it follows that

$$ p(x, z) = p(x,\; v = z - Hx) = p(x)\,p_v(z - Hx)^{**} \tag{13} $$

(iib) Evaluate $p(z/x)$*.

$$ p(z/x) = \frac{p(x, z)}{p(x)} = p_v(z - Hx) \tag{14} $$

(iii) Evaluate $p(x/z)$.

One gets from Bayes' rule,

$$ p(x/z) = \frac{p(x, z)}{p(z)} \tag{15} $$

By direct substitution of (10), (11), and (12), one obtains

$$ p(x/z) = \frac{|H P_0 H^T + R|^{1/2}}{(2\pi)^{n/2}\,|P_0|^{1/2}\,|R|^{1/2}} \exp\Big\{ -\tfrac{1}{2}\Big[ (x - \bar{x})^T P_0^{-1} (x - \bar{x}) + (z - Hx)^T R^{-1} (z - Hx) - (z - H\bar{x})^T (H P_0 H^T + R)^{-1} (z - H\bar{x}) \Big] \Big\} \tag{16} $$

Now completing squares in the $\{\cdot\}$, (16) simplifies to

$$ p(x/z) = \frac{|H P_0 H^T + R|^{1/2}}{(2\pi)^{n/2}\,|P_0|^{1/2}\,|R|^{1/2}} \exp\Big\{ -\tfrac{1}{2} (x - \hat{x})^T P^{-1} (x - \hat{x}) \Big\} \tag{17} $$

where

$$ P^{-1} = P_0^{-1} + H^T R^{-1} H \tag{18} $$

or equivalently,

$$ P = P_0 - P_0 H^T (H P_0 H^T + R)^{-1} H P_0 \tag{19} $$

and

$$ \hat{x} = \bar{x} + P H^T R^{-1} (z - H\bar{x}) \tag{20} $$

(iv) Now since $p(x/z)$ is gaussian, the most probable, conditional mean, and minimax estimates all coincide and are given by $\hat{x}$.

This is the derivation of the single stage Wiener-Kalman filter [2], [3]. The pair $(P, \hat{x})$ is called a sufficient statistic for the problem in the sense that $p(x/z) = p(x/P, \hat{x})$.
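The single stage result can be checked numerically in the scalar case. The sketch below is only an illustration: the function name and the numbers in the usage lines are assumptions, and the restriction to scalar $H$, $P_0$ and $R$ is made so that no matrix library is needed. It evaluates the posterior variance by both (18) and (19) and the estimate by (20).

```python
def single_stage_update(x_bar, p0, h, r, z):
    """Scalar version of (18)-(20) for z = h*x + v.

    x_bar, p0 : a priori mean and variance of x, as in (10)
    r         : variance of the noise v, as in (11)
    Returns (x_hat, p, p_alt), where p comes from (18) and p_alt from (19).
    """
    p = 1.0 / (1.0 / p0 + h * h / r)                   # (18)
    p_alt = p0 - p0 * h * h * p0 / (h * p0 * h + r)    # (19)
    x_hat = x_bar + p * h / r * (z - h * x_bar)        # (20)
    return x_hat, p, p_alt


# Hypothetical numbers, chosen only to exercise the formulas.
x_hat, p, p_alt = single_stage_update(x_bar=0.0, p0=4.0, h=1.0, r=1.0, z=2.5)
print(x_hat, p, p_alt)
```

With a prior variance four times the noise variance, the estimate moves four fifths of the way from the prior mean toward the measurement, and the two expressions for $P$ agree to rounding error.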

MULTI-STAGE  ESTIMATION  PROBLEM 

The problem formulation and the solution in this case are basically similar to the single stage problem. The only additional complication is that now the state is changing from stage to stage according to some dynamic relationship and that the a posteriori density function is to be computed recursively.

Problem  Statement 


It is assumed that at any stage $k+1$, the following data is given as a result of previous computation or as part of the problem statement:


*Note: step (iib) is redundant.

**$p_v(z - Hx)$ means substituting $(z - Hx)$ for $v$ in $p(v)$.

*It is assumed that $p(x/z)$ has finite second moment.
(i) The system equations governing the evolution of the state,

$$ x_{k+1} = f(x_k, w_k), \qquad z_{k+1} = h(x_{k+1}, v_{k+1}) \tag{21} $$

where $x_{k+1}$ is the state vector at $k+1$,
$v_{k+1}$ is the measurement noise at $k+1$,
$z_{k+1}$ is the additional measurement available at $k+1$,
$w_k$ is the disturbance vector at $k$.

(ii) The complete set of measurements

$$ Z_{k+1} \triangleq (z_1, \ldots, z_{k+1}) $$

(iii) The density functions*

$p(x_k/z_1, \ldots, z_k) \triangleq p(x_k/Z_k)$

$p(w_k, v_{k+1}/x_k)$: statistics of a vector random sequence with components $w_k$ and $v_{k+1}$ which depends on $x_k$.

Now it is required to estimate $x_{k+1}$ based on the measurements $z_1, \ldots, z_{k+1}$.


The  Bayesian  Solution 


The procedure is analogous to the single stage case.

(i) Evaluate $p(x_{k+1}/x_k)$. This can be accomplished either experimentally or analytically from knowledge of $p(w_k, v_{k+1}/x_k)$, $p(x_k/Z_k)$ and (21).

(ii) Evaluate $p(z_{k+1}/x_k, x_{k+1})$. This is derived from $p(w_k, v_{k+1}/x_k)$ and (21).

(iii) Evaluate

$$ p(x_{k+1}, z_{k+1}/Z_k) = \int p(z_{k+1}/x_k, x_{k+1})\, p(x_{k+1}/x_k)\, p(x_k/Z_k)\, dx_k \tag{22} $$

From this the marginal density functions $p(x_{k+1}/Z_k)$ and $p(z_{k+1}/Z_k)$ can be directly evaluated.

(iv) Evaluate

$$ p(x_{k+1}/Z_{k+1}) = \frac{p(x_{k+1}, z_{k+1}/Z_k)}{p(z_{k+1}/Z_k)} \tag{23} $$

which from (22) is

$$ p(x_{k+1}/Z_{k+1}) = \frac{\int p(z_{k+1}/x_k, x_{k+1})\, p(x_{k+1}/x_k)\, p(x_k/Z_k)\, dx_k}{\iint p(z_{k+1}/x_k, x_{k+1})\, p(x_{k+1}/x_k)\, p(x_k/Z_k)\, dx_k\, dx_{k+1}} \tag{24} $$

Eqn. (24) is a functional integral difference equation governing the evolution of the a posteriori density function of the state of (21).

(v) Estimates for $x_{k+1}$ can now be obtained from $p(x_{k+1}/Z_{k+1})$ exactly as in the single stage case.
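When the state takes only finitely many values, the integrals in (22) through (24) become sums and one stage of the recursion can be written directly. The sketch below is a minimal illustration under that assumption; the function name `bayes_step`, the two-state chain, and the measurement model in the usage lines are hypothetical and are not taken from the paper.

```python
def bayes_step(post_k, trans, meas, z_next):
    """One stage of (22)-(24) on a finite state space.

    post_k[i]     = p(x_k = i / Z_k)
    trans[i][j]   = p(x_{k+1} = j / x_k = i)
    meas(z, i, j) = p(z_{k+1} = z / x_k = i, x_{k+1} = j)
    Returns the list p(x_{k+1} = j / Z_{k+1}).
    """
    n = len(post_k)
    # (22): joint density of x_{k+1} and the new measurement, given Z_k.
    joint = [
        sum(meas(z_next, i, j) * trans[i][j] * post_k[i] for i in range(n))
        for j in range(n)
    ]
    p_z = sum(joint)                 # marginal p(z_{k+1} / Z_k)
    return [v / p_z for v in joint]  # (23), i.e. the ratio in (24)


# Hypothetical two-state example with a noisy binary measurement.
trans = [[0.9, 0.1], [0.2, 0.8]]


def meas(z, i, j):
    # The measurement reports the new state correctly with probability 0.9.
    return 0.9 if z == j else 0.1


post = bayes_step([0.5, 0.5], trans, meas, z_next=1)
print(post)
```

Calling `bayes_step` repeatedly, feeding each result back in as `post_k`, gives the recursive computation of the a posteriori density described above.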


*The product of the two density functions yields $p(w_k, v_{k+1}, x_k/Z_k)$ by the Markov property of (21). It is also assumed that if $p(w, v/x) = p(w, v)$ then $w$, $v$ is a white random sequence.


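Special Case of the Wiener-Kalman Filter*

The given data at $k+1$ is specified as follows. The physical model is given by

$$ x_{k+1} = \Phi x_k + \Gamma w_k, \qquad z_k = H x_k + v_k \tag{25} $$

where $w$ and $v$ are independent, white, gaussian random sequences, with

$$ p(x_k/Z_k) \text{ gaussian}, \qquad E(x_k/Z_k) = \hat{x}_k, \qquad \operatorname{Cov}(x_k/Z_k) = P_k \tag{26} $$

$$ p(w_k, v_{k+1}/x_k, Z_k) = p(w_k, v_{k+1}) = p(w_k)\,p(v_{k+1}), \qquad E(w_k) = E(v_{k+1}) = 0, \qquad \operatorname{Cov}(w_k) = Q, \quad \operatorname{Cov}(v_{k+1}) = R \tag{27} $$

Since in this case the noise $w_k$, $v_{k+1}$ is not dependent on the state, Eqn. (24) simplifies to

$$ p(x_{k+1}/Z_{k+1}) = \frac{p(z_{k+1}/x_{k+1})\, p(x_{k+1}/Z_k)}{p(z_{k+1}/Z_k)} \tag{24'} $$

Hence the solution only involves the evaluation of the three density functions on the r.h.s. of (24)' given the data (25-27). This is carried out below.

From (27), it is noted that $p(x_{k+1}/Z_k)$ is gaussian and independent of $v_{k+1}$, with

$$ E(x_{k+1}/Z_k) = \Phi \hat{x}_k, \qquad \operatorname{Cov}(x_{k+1}/Z_k) = \Phi P_k \Phi^T + \Gamma Q \Gamma^T \triangleq M_{k+1} \tag{28} $$

Similarly, $p(z_{k+1}/Z_k)$ is gaussian and

$$ E(z_{k+1}/Z_k) = H \Phi \hat{x}_k, \qquad \operatorname{Cov}(z_{k+1}/Z_k) = H M_{k+1} H^T + R \tag{29} $$

Finally, $p(z_{k+1}/x_{k+1})$ is also gaussian with

$$ E(z_{k+1}/x_{k+1}) = H x_{k+1}, \qquad \operatorname{Cov}(z_{k+1}/x_{k+1}) = R \tag{30} $$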

*Footnote added in proof.

This development of the multistage Wiener-Kalman filtering method is very similar to a paper by Drs. H. Rauch, F. Tung, and C. Striebel entitled "On the Maximum Likelihood Estimate for Linear Dynamic Systems" presented at the SIAM Conference on System Optimization, 1964, Monterey, California. The only difference between the two developments is this: the Rauch-Tung-Striebel paper does not explicitly compute $p(x/z)$ but simply computes its maximum and uses it as the estimate. In the authors' approach, the computation of the maximum plays a secondary role. The explicit calculation of the a posteriori probability is emphasized as the Bayesian viewpoint.

The authors are indebted to Prof. A. E. Bryson for bringing this reference to their attention.
Combining (28-30) using (24)', one gets

$$ p(x_{k+1}/Z_{k+1}) = \frac{|H M_{k+1} H^T + R|^{1/2}}{(2\pi)^{n/2}\,|R|^{1/2}\,|M_{k+1}|^{1/2}} \exp\Big\{ -\tfrac{1}{2}\Big[ (x_{k+1} - \Phi\hat{x}_k)^T M_{k+1}^{-1} (x_{k+1} - \Phi\hat{x}_k) + (z_{k+1} - H x_{k+1})^T R^{-1} (z_{k+1} - H x_{k+1}) - (z_{k+1} - H\Phi\hat{x}_k)^T (H M_{k+1} H^T + R)^{-1} (z_{k+1} - H\Phi\hat{x}_k) \Big] \Big\} \tag{31} $$

Now completing squares in $\{\cdot\}$ one gets,

$$ p(x_{k+1}/Z_{k+1}) = \frac{|H M_{k+1} H^T + R|^{1/2}}{(2\pi)^{n/2}\,|R|^{1/2}\,|M_{k+1}|^{1/2}} \exp\Big\{ -\tfrac{1}{2} (x_{k+1} - \hat{x}_{k+1})^T P_{k+1}^{-1} (x_{k+1} - \hat{x}_{k+1}) \Big\} \tag{32} $$

where

$$ \hat{x}_{k+1} = \Phi\hat{x}_k + M_{k+1} H^T (H M_{k+1} H^T + R)^{-1} (z_{k+1} - H\Phi\hat{x}_k) \tag{33} $$

$$ P_{k+1}^{-1} = M_{k+1}^{-1} + H^T R^{-1} H \tag{34} $$

or equivalently

$$ P_{k+1} = M_{k+1} - M_{k+1} H^T (H M_{k+1} H^T + R)^{-1} H M_{k+1} \tag{35} $$

and

$$ M_{k+1} = \Phi P_k \Phi^T + \Gamma Q \Gamma^T \tag{36} $$

Eqns. (33-36) are exactly the discrete Wiener-Kalman filter in the multistage case [3], [4].

A SIMPLE NONLINEAR NONGAUSSIAN ESTIMATION PROBLEM

The discussions in the above sections have been carried out in terms of continuous density functions. However, it is obvious that the same process can be applied to problems involving discrete density functions and discontinuous functional relationships. It is worthwhile, at this point, to carry out one such solution for a simple contrived example which nevertheless illustrates the application of the basic approach.

The problem can be visualized as an abstraction of the following physical estimation problem. An infrared detector followed by a threshold device is used in a satellite to detect hot targets on the ground. However, extraneous signals, particularly reflections from clouds, obscure the measurements. The problem is to design a multistage estimation process to estimate the presence of hot targets on the ground through measurement of the output of the threshold detector.

Let $s_k$ (target) be a scalar independent Bernoulli process with

$$ p(s_k) = (1 - q)\,\delta(s_k) + q\,\delta(1 - s_k) \tag{37}^* $$

Let $n_k$ (cloud noise) be a scalar Markov process with

$$ p(n_1) = (1 - a)\,\delta(n_1) + a\,\delta(1 - n_1) \tag{38} $$

$$ p(n_{k+1}/n_k) = \Big(1 - a - \frac{n_k}{2}\Big)\,\delta(n_{k+1}) + \Big(a + \frac{n_k}{2}\Big)\,\delta(1 - n_{k+1}) \tag{39} $$

and the scalar measurement,

$$ z_k = s_k \oplus n_k \tag{40} $$

where $\oplus$ indicates the logical "OR" operation.

Essentially, Eqns. (37-40) indicate the fact that as the detector sweeps across the field of view, cloud reflections tend to appear in groups while targets appear as isolated dots.

Now we proceed to the Bayesian solution. First, we have,

    n_1    s_1    z_1    Probability of z_1
     0      0      0     (1-a)(1-q)
     0      1      1     q(1-a)
     1      0      1     a(1-q)
     1      1      1     aq

$$ p(z_1) = (1 - a)(1 - q)\,\delta(z_1) + (a + q - aq)\,\delta(z_1 - 1) \tag{41} $$

Also,

$$ p(z_1/n_1) = \delta(z_1 - 1)\,n_1 + \big[(1 - q)\,\delta(z_1) + q\,\delta(z_1 - 1)\big](1 - n_1) \tag{42} $$

Then by direct calculation,

$$ p(n_1/z_1) = \frac{p(z_1/n_1)\,p(n_1)}{p(z_1)} = \big(1 - a'(z_1)\big)\,\delta(n_1) + a'(z_1)\,\delta(n_1 - 1) \tag{43} $$

where

$$ a'(z_1) = \frac{a\,\delta(z_1 - 1)}{(1 - a)(1 - q)\,\delta(z_1) + (a + q - aq)\,\delta(z_1 - 1)} \tag{44} $$

Similarly,

$$ p(z_1/s_1) = \delta(z_1 - 1)\,s_1 + \big[(1 - a)\,\delta(z_1) + a\,\delta(z_1 - 1)\big](1 - s_1) \tag{45} $$

and

$$ p(s_1/z_1) = \frac{p(z_1/s_1)\,p(s_1)}{p(z_1)} = \big(1 - q'(z_1)\big)\,\delta(s_1) + q'(z_1)\,\delta(s_1 - 1) \tag{46} $$

where

$$ q'(z_1) = \frac{q\,\delta(z_1 - 1)}{(1 - a)(1 - q)\,\delta(z_1) + (a + q - aq)\,\delta(z_1 - 1)} \tag{47} $$

and a reasonable estimate is

$$ \hat{s}_1 = \begin{cases} 1 & \text{if } q'(z_1) \ge c \text{ (given constant)} \\ 0 & \text{if } q'(z_1) < c \end{cases} \tag{48} $$

where $\hat{s}_1 = 1$ may be interpreted as an alarm.

*The notation $\delta(x) = 1$ for $x = 0$ and $\delta(x) = 0$ for $x \ne 0$ is used here. $p(\cdot)$ is to be interpreted as a mass function.
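The first-stage quantities in (41), (44) and (47) can be checked by brute-force enumeration of the four $(s_1, n_1)$ combinations. The sketch below is an illustration only; the function name and the dictionary layout are assumptions, and the values $a = q = 1/4$ are simply convenient test numbers.

```python
from itertools import product


def first_stage_posteriors(a, q):
    """Enumerate (s_1, n_1), form z_1 = s_1 OR n_1, and condition on z_1.

    Returns three dicts keyed by z_1:
      p_z[z]    = p(z_1 = z)
      a_post[z] = p(n_1 = 1 / z_1 = z), i.e. a'(z_1)
      q_post[z] = p(s_1 = 1 / z_1 = z), i.e. q'(z_1)
    """
    joint = {}
    for s, n in product((0, 1), repeat=2):
        p_s = q if s == 1 else 1.0 - q      # (37)
        p_n = a if n == 1 else 1.0 - a      # (38)
        joint[(s, n, s | n)] = p_s * p_n    # (40): z_1 = s_1 OR n_1

    p_z, a_post, q_post = {}, {}, {}
    for z in (0, 1):
        rows = [(s, n, p) for (s, n, zz), p in joint.items() if zz == z]
        p_z[z] = sum(p for _, _, p in rows)
        a_post[z] = sum(p for _, n, p in rows if n == 1) / p_z[z]
        q_post[z] = sum(p for s, _, p in rows if s == 1) / p_z[z]
    return p_z, a_post, q_post


a, q = 0.25, 0.25
p_z, a_post, q_post = first_stage_posteriors(a, q)
print(p_z, a_post, q_post)
```

For $z_1 = 0$ both posteriors are zero, since neither a target nor a cloud can be present; for $z_1 = 1$ the enumeration should reproduce the ratios $a/(a + q - aq)$ and $q/(a + q - aq)$.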
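Now consider that a second measurement $z_2$ has been made. One has,

$$ p(n_2/z_1) = \int_{-\infty}^{\infty} p(n_2/n_1)\, p(n_1/z_1)\, dn_1 \tag{49} $$

which after straightforward but somewhat laborious manipulations becomes,

$$ p(n_2/z_1) = \Big(1 - a - \frac{a'(z_1)}{2}\Big)\,\delta(n_2) + \Big(a + \frac{a'(z_1)}{2}\Big)\,\delta(n_2 - 1) \triangleq \big(1 - \bar{a}(z_1)\big)\,\delta(n_2) + \bar{a}(z_1)\,\delta(n_2 - 1) $$

Furthermore,

$$ p(s_2/z_1) = p(s_2) = (1 - q)\,\delta(s_2) + q\,\delta(s_2 - 1) \tag{50} $$

Eqns. (49) and (50) now take the place of (37) and (38), and by the same process one can get in general,

$$ p(n_k/Z_k) = p(n_k/z_k, z_{k-1}, \ldots) = \big(1 - a'(Z_k)\big)\,\delta(n_k) + a'(Z_k)\,\delta(n_k - 1) \tag{51} $$

$$ a'(Z_k) = a'(z_k, z_{k-1}, \ldots) = \frac{\bar{a}(Z_{k-1})\,\delta(z_k - 1)}{\big(1 - \bar{a}(Z_{k-1})\big)(1 - q)\,\delta(z_k) + \big(\bar{a}(Z_{k-1}) + q - \bar{a}(Z_{k-1})\,q\big)\,\delta(z_k - 1)} \tag{52} $$

$$ \bar{a}(Z_{k-1}) = \bar{a}(z_{k-1}, z_{k-2}, \ldots) = a + \frac{a'(Z_{k-1})}{2} \tag{53} $$

$$ p(s_k/Z_k) = p(s_k/z_k, \ldots) = \big(1 - q'(Z_k)\big)\,\delta(s_k) + q'(Z_k)\,\delta(s_k - 1) \tag{54} $$

$$ q'(Z_k) = q'(z_k, z_{k-1}, \ldots) = \frac{q\,\delta(z_k - 1)}{\big(1 - \bar{a}(Z_{k-1})\big)(1 - q)\,\delta(z_k) + \big(\bar{a}(Z_{k-1}) + q - \bar{a}(Z_{k-1})\,q\big)\,\delta(z_k - 1)} \tag{55} $$

$$ p(n_{k+1}/Z_k) = \big(1 - \bar{a}(Z_k)\big)\,\delta(n_{k+1}) + \bar{a}(Z_k)\,\delta(n_{k+1} - 1) \tag{56} $$

$$ p(s_{k+1}/Z_k) = p(s_{k+1}) \tag{57} $$

Eqns. (51-57) now represent the general recursive solution for the multistage estimation process.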

As a check, two possible observed sequences for $z$, namely (0, 1) and (1, 1), are considered. With $a = 1/4$ and $q = 1/4$ it is found that $p(s_2 = 1/z_1, z_2) = 0.571$ and $0.337$ respectively. This agrees with intuition, since the sequence (1, 1) has a higher probability of being cloud reflections. On the other hand, the numbers also show that under the circumstances it is very difficult to detect targets with accuracy using the system contrived here.
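The recursion (51) through (57) is short enough to run directly. The sketch below is one reading of Eqns. (52), (53) and (55), with the predicted cloud probability started from $a$ as in (38); the function name and variable names are assumptions. Under this reading the sequence (0, 1) reproduces the quoted 0.571, while (1, 1) evaluates to roughly 0.384 rather than 0.337. The second value is still well below the first, so the qualitative conclusion is unaffected.

```python
def target_posteriors(zs, a, q):
    """Return q'(Z_k) = p(s_k = 1 / Z_k) for each stage k of the sequence zs.

    a : base cloud probability in (38)-(39); needs a <= 1/2 so that
        a + n_k / 2 in (39) remains a probability
    q : target probability in (37)
    """
    a_bar = a            # predicted p(n_1 = 1) before any measurement, from (38)
    out = []
    for z in zs:
        if z == 1:
            denom = a_bar + q - a_bar * q    # p(z_k = 1 / Z_{k-1})
            a_prime = a_bar / denom          # (52)
            q_prime = q / denom              # (55)
        else:
            # z_k = 0 rules out both a cloud and a target at stage k.
            a_prime = 0.0
            q_prime = 0.0
        out.append(q_prime)
        a_bar = a + a_prime / 2.0            # (53): prediction for the next stage
    return out


seq_01 = target_posteriors([0, 1], a=0.25, q=0.25)
seq_11 = target_posteriors([1, 1], a=0.25, q=0.25)
print(seq_01[-1], seq_11[-1])
```

An alarm rule like (48) can then be applied stage by stage by comparing each returned value with the chosen constant $c$.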

Often one is actually interested in $p(s_k/Z_{k+\ell})$ with $\ell > 0$ in order to obtain the so-called "smoothed" estimate for $s_k$. The desired density function can be computed from $p(s_k/Z_k)$ by further manipulations. However, the calculation becomes involved and will not be done here.


It is worthwhile to point out the relationship of the above formulation and solution of the estimation problem to, and its difference from, the general statistical decision problem. For simplicity, the single stage case is considered again. In the general statistical decision problem, the input data is somewhat different. One typical form is:*

$p(x)$: a priori density of $x$

$\{e\}$: a set of choices of experiments from which we can derive measurements $z$ with

$p(z/x, e)$: conditional density of $z$ for given $x$ and $e$

$\{u\}$: a set of choices of decisions

$J(e, z, u, x)$: a criterion function which is a possible function of $e$, $z$, $u$ and $x$.

The problem is then stated as the determination of $e$ and $u$ so that $E(J)$ is optimized. The optimal $J$ is given by

$$ J_{\text{opt}} = \underset{e}{\text{Max (Min)}} \int \Big\{ \underset{u}{\text{Max (Min)}} \Big[ \int J(e, z, u, x)\, p(x/z, e)\, dx \Big] \Big\}\, p(z/e)\, dz \tag{58}^{**} $$
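For finite sets of states, measurements, experiments and decisions, the nested optimization in (58) reduces to sums and minimizations that can be written out directly. The sketch below takes the "Min" reading of (58); the function name, the two experiments, their costs, and the loss are all hypothetical and serve only to show the order of the operations.

```python
def optimal_experiment(prior, experiments, decisions, loss):
    """Evaluate (58) with Min over e and u on finite sets.

    prior[x]          = p(x)
    experiments[e][z][x] = p(z / x, e)
    loss(e, z, u, x)  = criterion J(e, z, u, x)
    Returns (J_opt, best experiment).
    """
    n = len(prior)
    best_value, best_e = None, None
    for e, lik in experiments.items():
        expected = 0.0
        for z, row in enumerate(lik):
            p_z = sum(row[x] * prior[x] for x in range(n))       # p(z / e)
            if p_z == 0.0:
                continue
            post = [row[x] * prior[x] / p_z for x in range(n)]   # p(x / z, e)
            # Inner optimization over the decision u for this outcome z.
            inner = min(
                sum(loss(e, z, u, x) * post[x] for x in range(n))
                for u in decisions
            )
            expected += inner * p_z
        if best_value is None or expected < best_value:
            best_value, best_e = expected, e
    return best_value, best_e


# Hypothetical data: a free but noisy experiment versus a costly, sharper one.
cost = {"cheap": 0.0, "precise": 0.1}
experiments = {
    "cheap": [[0.7, 0.3], [0.3, 0.7]],
    "precise": [[0.95, 0.05], [0.05, 0.95]],
}


def loss(e, z, u, x):
    # Cost of running the experiment plus a unit penalty for a wrong decision.
    return cost[e] + (0.0 if u == x else 1.0)


value, choice = optimal_experiment([0.5, 0.5], experiments, [0, 1], loss)
print(value, choice)
```

The posterior $p(x/z, e)$ is computed before the decision $u$ is chosen, which is the separation into an estimation step and an action step discussed in this section.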

Thus, the main differences between the estimation problem and the general decision problem are as follows:

(i) In the estimation problem there is no choice of experiment. One always makes the same type of measurement $z$ given by $g(x, v)$. To generalize the estimation problem, one can specify:

$$ z = g_e(x, v); \quad e = 1, 2, \ldots = \text{possible sets of measurements} \tag{59} $$

and then require that

$$ \hat{x} = \underset{e}{\text{Opt}}\,(\hat{x}), \quad e = 1, 2, \ldots $$

(ii) In the general decision problem, the function $z = g(x, v)$ is implicit in $p(z/x, e)$. Hence steps (iia) and (iib) of the estimation solution are not required. This often is a tremendous simplification.

(iii) In the estimation problem the criterion function $J$ is always a simple function of $x$ only. There is, furthermore, no choice of action (one has to make an estimate by definition). On the other hand, the general decision problem is more analogous to a combined estimation and control problem, where one has a further choice of action after determining $p(x/z)$, and, like a control problem, the criterion function is generally more complex.

(iv) It is, however, to be noted that the key step is the computation of $p(x/z)$ for both problems. The choice of action is determined only after the computation of $p(x/z)$. Thus, a general decision problem can be decomposed into two problems, namely, determination of $p(x/z)$ (estimation problem) and choice of action (control problem). In control-theoretic terminology, this fact is called the Generalized Decomposition Axiom.

As an example, consider the single stage Wiener-Kalman problem. The added requirement that,

$$ J(e, z, u, x) = J(u, x) = E\,\|Bx + u\|^2 = \int \|Bx + u\|^2\, p(x/z)\, dx \tag{60} $$


*For other equivalent forms see [1].
**See [1].
CONCLUSION


In the above sections, the problem of estimation from the Bayesian viewpoint is discussed. It is the authors' thesis that this approach offers a unifying methodology, at least conceptually, for the general problems of estimation and control.

The a posteriori conditional density function $p(x/z)$ is seen to be the key to the solution of the general problem. Difficulties associated with the solution of the general problem now appear more specifically as difficulties in the steps leading to the computation of $p(x/z)$. From the above discussions, it is relatively obvious that these difficulties are:

(i) Computation of $p(z/x)$.

In both the single stage and the multistage case, this problem is complicated by the nonlinear functional relationships between $z$ and $x$. Except in the case when $z$ and $x$ are linearly related or when $z$ and $x$ are scalars, very little can be done in general, analytically or experimentally. As was mentioned earlier, this difficulty does not appear in the usual decision problem, since there it is assumed that $p(z/x)$ is given as part of the problem.

(ii) Requirement that $p(x/z)$ be in analytical form.

This is an obvious requirement if we intend to use the solution in real-time applications. It will not be feasible to compute $p(x/z)$ after $z$ has occurred.

(iii) Requirement that $p(x)$, $p(z)$, $p(x/z)$ be conjugate distributions.*

This is simply the requirement that $p(x)$ and $p(x/z)$ be density functions from the same family. Note that all the examples discussed in this paper possess this desirable property. This is precisely the reason that the multistage computation can be done efficiently. It imposes a further restriction on the functions $g$, $f$ and $h$.

The difficulties (i-iii) listed above are formidable ones. It is not likely that they can be easily circumvented except for special classes of problems, such as those discussed. However, it is worthwhile first to pinpoint these difficulties. Research toward their solution can then be effectively initiated. Finally, it is felt that the Bayesian approach offers a unified and intuitive viewpoint particularly adaptable to handling modern day control problems, where the state and the Markov assumptions play a fundamental role.


*See [1].